Overview

Dataset statistics

Number of variables: 16
Number of observations: 700
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 87.6 KiB
Average record size in memory: 128.2 B

Variable types

Categorical: 6
Numeric: 9
DateTime: 1

Warnings

Gross Sales is highly correlated with Sales and 1 other field (High correlation)
Sales is highly correlated with Gross Sales and 1 other field (High correlation)
COGS is highly correlated with Gross Sales and 1 other field (High correlation)
Country is uniformly distributed (Uniform)
Discounts has 53 (7.6%) zeros (Zeros)

Reproduction

Analysis started: 2021-02-24 10:41:27.118940
Analysis finished: 2021-02-24 10:41:42.130377
Duration: 15.01 seconds
Software version: pandas-profiling v2.11.0
Download configuration: config.yaml

Variables

Segment
Categorical

Distinct: 5
Distinct (%): 0.7%
Missing: 0
Missing (%): 0.0%
Memory size: 5.6 KiB

Government: 300
Enterprise: 100
Channel Partners: 100
Small Business: 100
Midmarket: 100

Length

Max length: 16
Median length: 10
Mean length: 11.28571429
Min length: 9

Characters and Unicode

Total characters: 7900
Distinct characters: 24
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Government
2nd row: Government
3rd row: Midmarket
4th row: Midmarket
5th row: Midmarket

Value | Count | Frequency (%)
Government | 300 | 42.9%
Enterprise | 100 | 14.3%
Channel Partners | 100 | 14.3%
Small Business | 100 | 14.3%
Midmarket | 100 | 14.3%

Histogram of lengths of the category
ValueCountFrequency (%)
government300
33.3%
enterprise100
 
11.1%
small100
 
11.1%
partners100
 
11.1%
channel100
 
11.1%
business100
 
11.1%
midmarket100
 
11.1%

Most occurring characters

ValueCountFrequency (%)
e1200
15.2%
n1100
13.9%
r800
10.1%
t600
 
7.6%
m500
 
6.3%
s500
 
6.3%
a400
 
5.1%
G300
 
3.8%
o300
 
3.8%
v300
 
3.8%
Other values (14)1900
24.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter6800
86.1%
Uppercase Letter900
 
11.4%
Space Separator200
 
2.5%

Most frequent character per category

ValueCountFrequency (%)
e1200
17.6%
n1100
16.2%
r800
11.8%
t600
8.8%
m500
7.4%
s500
7.4%
a400
 
5.9%
o300
 
4.4%
v300
 
4.4%
i300
 
4.4%
Other values (6)800
11.8%
ValueCountFrequency (%)
G300
33.3%
M100
 
11.1%
C100
 
11.1%
P100
 
11.1%
E100
 
11.1%
S100
 
11.1%
B100
 
11.1%
ValueCountFrequency (%)
200
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin7700
97.5%
Common200
 
2.5%

Most frequent character per script

ValueCountFrequency (%)
e1200
15.6%
n1100
14.3%
r800
10.4%
t600
 
7.8%
m500
 
6.5%
s500
 
6.5%
a400
 
5.2%
G300
 
3.9%
o300
 
3.9%
v300
 
3.9%
Other values (13)1700
22.1%
ValueCountFrequency (%)
200
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII7900
100.0%

Most frequent character per block

ValueCountFrequency (%)
e1200
15.2%
n1100
13.9%
r800
10.1%
t600
 
7.6%
m500
 
6.3%
s500
 
6.3%
a400
 
5.1%
G300
 
3.8%
o300
 
3.8%
v300
 
3.8%
Other values (14)1900
24.1%

Country
Categorical

UNIFORM

Distinct: 5
Distinct (%): 0.7%
Missing: 0
Missing (%): 0.0%
Memory size: 5.6 KiB

United States of America: 140
Germany: 140
Canada: 140
France: 140
Mexico: 140

Length

Max length: 24
Median length: 6
Mean length: 9.8
Min length: 6

Characters and Unicode

Total characters: 6860
Distinct characters: 22
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Canada
2nd row: Germany
3rd row: France
4th row: Germany
5th row: Mexico

Value | Count | Frequency (%)
United States of America | 140 | 20.0%
Germany | 140 | 20.0%
Canada | 140 | 20.0%
France | 140 | 20.0%
Mexico | 140 | 20.0%

Histogram of lengths of the category
ValueCountFrequency (%)
united140
12.5%
of140
12.5%
states140
12.5%
mexico140
12.5%
germany140
12.5%
france140
12.5%
america140
12.5%
canada140
12.5%

Most occurring characters

ValueCountFrequency (%)
a980
14.3%
e840
12.2%
n560
 
8.2%
r420
 
6.1%
c420
 
6.1%
i420
 
6.1%
t420
 
6.1%
420
 
6.1%
d280
 
4.1%
m280
 
4.1%
Other values (12)1820
26.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter5460
79.6%
Uppercase Letter980
 
14.3%
Space Separator420
 
6.1%

Most frequent character per category

ValueCountFrequency (%)
a980
17.9%
e840
15.4%
n560
10.3%
r420
7.7%
c420
7.7%
i420
7.7%
t420
7.7%
d280
 
5.1%
m280
 
5.1%
o280
 
5.1%
Other values (4)560
10.3%
ValueCountFrequency (%)
C140
14.3%
G140
14.3%
F140
14.3%
M140
14.3%
U140
14.3%
S140
14.3%
A140
14.3%
ValueCountFrequency (%)
420
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin6440
93.9%
Common420
 
6.1%

Most frequent character per script

ValueCountFrequency (%)
a980
15.2%
e840
13.0%
n560
 
8.7%
r420
 
6.5%
c420
 
6.5%
i420
 
6.5%
t420
 
6.5%
d280
 
4.3%
m280
 
4.3%
o280
 
4.3%
Other values (11)1540
23.9%
ValueCountFrequency (%)
420
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII6860
100.0%

Most frequent character per block

ValueCountFrequency (%)
a980
14.3%
e840
12.2%
n560
 
8.2%
r420
 
6.1%
c420
 
6.1%
i420
 
6.1%
t420
 
6.1%
420
 
6.1%
d280
 
4.1%
m280
 
4.1%
Other values (12)1820
26.5%

Product
Categorical

Distinct: 6
Distinct (%): 0.9%
Missing: 0
Missing (%): 0.0%
Memory size: 5.6 KiB

Paseo: 202
VTT: 109
Velo: 109
Amarilla: 94
Carretera: 93

Length

Max length: 9
Median length: 5
Mean length: 5.732857143
Min length: 3

Characters and Unicode

Total characters: 4013
Distinct characters: 16
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Carretera
2nd row: Carretera
3rd row: Carretera
4th row: Carretera
5th row: Carretera

Value | Count | Frequency (%)
Paseo | 202 | 28.9%
VTT | 109 | 15.6%
Velo | 109 | 15.6%
Amarilla | 94 | 13.4%
Carretera | 93 | 13.3%
Montana | 93 | 13.3%

Histogram of lengths of the category
ValueCountFrequency (%)
paseo202
28.9%
vtt109
15.6%
velo109
15.6%
amarilla94
13.4%
carretera93
13.3%
montana93
13.3%

Most occurring characters

ValueCountFrequency (%)
a762
19.0%
e497
12.4%
o404
10.1%
r373
9.3%
l297
 
7.4%
V218
 
5.4%
T218
 
5.4%
P202
 
5.0%
s202
 
5.0%
t186
 
4.6%
Other values (6)654
16.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter3095
77.1%
Uppercase Letter918
 
22.9%

Most frequent character per category

ValueCountFrequency (%)
a762
24.6%
e497
16.1%
o404
13.1%
r373
12.1%
l297
 
9.6%
s202
 
6.5%
t186
 
6.0%
n186
 
6.0%
m94
 
3.0%
i94
 
3.0%
ValueCountFrequency (%)
V218
23.7%
T218
23.7%
P202
22.0%
A94
10.2%
C93
10.1%
M93
10.1%

Most occurring scripts

ValueCountFrequency (%)
Latin4013
100.0%

Most frequent character per script

ValueCountFrequency (%)
a762
19.0%
e497
12.4%
o404
10.1%
r373
9.3%
l297
 
7.4%
V218
 
5.4%
T218
 
5.4%
P202
 
5.0%
s202
 
5.0%
t186
 
4.6%
Other values (6)654
16.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII4013
100.0%

Most frequent character per block

ValueCountFrequency (%)
a762
19.0%
e497
12.4%
o404
10.1%
r373
9.3%
l297
 
7.4%
V218
 
5.4%
T218
 
5.4%
P202
 
5.0%
s202
 
5.0%
t186
 
4.6%
Other values (6)654
16.3%

Discount Band
Categorical

Distinct: 4
Distinct (%): 0.6%
Missing: 0
Missing (%): 0.0%
Memory size: 5.6 KiB

High: 245
Medium: 242
Low: 160
None: 53

Length

Max length: 6
Median length: 4
Mean length: 4.462857143
Min length: 3

Characters and Unicode

Total characters: 3124
Distinct characters: 14
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: None
2nd row: None
3rd row: None
4th row: None
5th row: None

Value | Count | Frequency (%)
High | 245 | 35.0%
Medium | 242 | 34.6%
Low | 160 | 22.9%
None | 53 | 7.6%

Histogram of lengths of the category
ValueCountFrequency (%)
high245
35.0%
medium242
34.6%
low160
22.9%
none53
 
7.6%

Most occurring characters

ValueCountFrequency (%)
i487
15.6%
e295
9.4%
H245
7.8%
g245
7.8%
h245
7.8%
M242
7.7%
d242
7.7%
u242
7.7%
m242
7.7%
o213
6.8%
Other values (4)426
13.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter2424
77.6%
Uppercase Letter700
 
22.4%

Most frequent character per category

ValueCountFrequency (%)
i487
20.1%
e295
12.2%
g245
10.1%
h245
10.1%
d242
10.0%
u242
10.0%
m242
10.0%
o213
8.8%
w160
 
6.6%
n53
 
2.2%
ValueCountFrequency (%)
H245
35.0%
M242
34.6%
L160
22.9%
N53
 
7.6%

Most occurring scripts

ValueCountFrequency (%)
Latin3124
100.0%

Most frequent character per script

ValueCountFrequency (%)
i487
15.6%
e295
9.4%
H245
7.8%
g245
7.8%
h245
7.8%
M242
7.7%
d242
7.7%
u242
7.7%
m242
7.7%
o213
6.8%
Other values (4)426
13.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII3124
100.0%

Most frequent character per block

ValueCountFrequency (%)
i487
15.6%
e295
9.4%
H245
7.8%
g245
7.8%
h245
7.8%
M242
7.7%
d242
7.7%
u242
7.7%
m242
7.7%
o213
6.8%
Other values (4)426
13.6%

Units Sold
Real number (ℝ≥0)

Distinct: 510
Distinct (%): 72.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1608.294286
Minimum: 200
Maximum: 4492.5
Zeros: 0
Zeros (%): 0.0%
Memory size: 5.6 KiB

Quantile statistics

Minimum: 200
5-th percentile: 344.95
Q1: 905
Median: 1542.5
Q3: 2229.125
95-th percentile: 2935.95
Maximum: 4492.5
Range: 4292.5
Interquartile range (IQR): 1324.125

Descriptive statistics

Standard deviation: 867.4278591
Coefficient of variation (CV): 0.5393464783
Kurtosis: -0.3153179967
Mean: 1608.294286
Median Absolute Deviation (MAD): 655.5
Skewness: 0.4361535619
Sum: 1125806
Variance: 752431.0907
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Value | Count | Frequency (%)
727 | 5 | 0.7%
1916 | 4 | 0.6%
1496 | 4 | 0.6%
663 | 4 | 0.6%
2844 | 4 | 0.6%
1743 | 4 | 0.6%
386 | 3 | 0.4%
1366 | 3 | 0.4%
2145 | 3 | 0.4%
1123 | 3 | 0.4%
Other values (500) | 663 | 94.7%

Value | Count | Frequency (%)
200 | 1 | 0.1%
214 | 2 | 0.3%
218 | 1 | 0.1%
241 | 2 | 0.3%
245 | 1 | 0.1%

Value | Count | Frequency (%)
4492.5 | 1 | 0.1%
4251 | 1 | 0.1%
4243.5 | 1 | 0.1%
4219.5 | 1 | 0.1%
4026 | 1 | 0.1%
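
Several of the figures reported for a numeric variable are linked by simple identities: range = maximum - minimum, IQR = Q3 - Q1, CV = standard deviation / mean, and variance = standard deviation squared. The following Python sketch recomputes them from the Units Sold figures as a sanity check. The variable names are illustrative and are not part of the report or of pandas-profiling.

```python
# Figures copied from the Units Sold summary in this report.
minimum, maximum = 200.0, 4492.5
q1, q3 = 905.0, 2229.125
mean, std = 1608.294286, 867.4278591

value_range = maximum - minimum  # report lists Range 4292.5
iqr = q3 - q1                    # report lists IQR 1324.125
cv = std / mean                  # report lists CV 0.5393464783
variance = std ** 2              # report lists Variance 752431.0907

print(value_range, iqr, cv, variance)
```

Small differences in the last digits are expected, because the report rounds the mean and standard deviation before they are reused here.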

Manufacturing Price
Real number (ℝ≥0)

Distinct: 6
Distinct (%): 0.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 96.47714286
Minimum: 3
Maximum: 260
Zeros: 0
Zeros (%): 0.0%
Memory size: 5.6 KiB

Quantile statistics

Minimum: 3
5-th percentile: 3
Q1: 5
Median: 10
Q3: 250
95-th percentile: 260
Maximum: 260
Range: 257
Interquartile range (IQR): 245

Descriptive statistics

Standard deviation: 108.6026122
Coefficient of variation (CV): 1.125682301
Kurtosis: -1.42896268
Mean: 96.47714286
Median Absolute Deviation (MAD): 7
Skewness: 0.5925839516
Sum: 67534
Variance: 11794.52737
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=6)

Value | Count | Frequency (%)
10 | 202 | 28.9%
120 | 109 | 15.6%
250 | 109 | 15.6%
260 | 94 | 13.4%
3 | 93 | 13.3%
5 | 93 | 13.3%

Value | Count | Frequency (%)
3 | 93 | 13.3%
5 | 93 | 13.3%
10 | 202 | 28.9%
120 | 109 | 15.6%
250 | 109 | 15.6%

Value | Count | Frequency (%)
260 | 94 | 13.4%
250 | 109 | 15.6%
120 | 109 | 15.6%
10 | 202 | 28.9%
5 | 93 | 13.3%

Sale Price
Real number (ℝ≥0)

Distinct: 7
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 118.4285714
Minimum: 7
Maximum: 350
Zeros: 0
Zeros (%): 0.0%
Memory size: 5.6 KiB

Quantile statistics

Minimum: 7
5-th percentile: 7
Q1: 12
Median: 20
Q3: 300
95-th percentile: 350
Maximum: 350
Range: 343
Interquartile range (IQR): 288

Descriptive statistics

Standard deviation: 136.7755146
Coefficient of variation (CV): 1.154919906
Kurtosis: -1.176789008
Mean: 118.4285714
Median Absolute Deviation (MAD): 13
Skewness: 0.7712818708
Sum: 82900
Variance: 18707.54139
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=7)

Value | Count | Frequency (%)
7 | 100 | 14.3%
12 | 100 | 14.3%
15 | 100 | 14.3%
20 | 100 | 14.3%
125 | 100 | 14.3%
300 | 100 | 14.3%
350 | 100 | 14.3%

Value | Count | Frequency (%)
7 | 100 | 14.3%
12 | 100 | 14.3%
15 | 100 | 14.3%
20 | 100 | 14.3%
125 | 100 | 14.3%

Value | Count | Frequency (%)
350 | 100 | 14.3%
300 | 100 | 14.3%
125 | 100 | 14.3%
20 | 100 | 14.3%
15 | 100 | 14.3%

Gross Sales
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 550
Distinct (%): 78.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 182759.4264
Minimum: 1799
Maximum: 1207500
Zeros: 0
Zeros (%): 0.0%
Memory size: 5.6 KiB

Quantile statistics

Minimum: 1799
5-th percentile: 5193.05
Q1: 17391.75
Median: 37980
Q3: 279025
95-th percentile: 754250
Maximum: 1207500
Range: 1205701
Interquartile range (IQR): 261633.25

Descriptive statistics

Standard deviation: 254262.2844
Coefficient of variation (CV): 1.391240328
Kurtosis: 2.054300583
Mean: 182759.4264
Median Absolute Deviation (MAD): 30747.5
Skewness: 1.673921656
Sum: 127931598.5
Variance: 6.464930926 × 10^10
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Value | Count | Frequency (%)
82875 | 3 | 0.4%
26145 | 3 | 0.4%
22710 | 3 | 0.4%
37050 | 3 | 0.4%
738000 | 3 | 0.4%
4404 | 3 | 0.4%
18540 | 2 | 0.3%
42660 | 2 | 0.3%
411600 | 2 | 0.3%
7248 | 2 | 0.3%
Other values (540) | 674 | 96.3%

Value | Count | Frequency (%)
1799 | 1 | 0.1%
1841 | 2 | 0.3%
1960 | 2 | 0.3%
2051 | 1 | 0.1%
2520 | 2 | 0.3%

Value | Count | Frequency (%)
1207500 | 1 | 0.1%
1140750 | 1 | 0.1%
1138050 | 1 | 0.1%
1048500 | 1 | 0.1%
1038100 | 2 | 0.3%

Discounts
Real number (ℝ≥0)

ZEROS

Distinct: 515
Distinct (%): 73.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 13150.35463
Minimum: 0
Maximum: 149677.5
Zeros: 53
Zeros (%): 7.6%
Memory size: 5.6 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 800.32
Median: 2585.25
Q3: 15956.34375
95-th percentile: 62832
Maximum: 149677.5
Range: 149677.5
Interquartile range (IQR): 15156.02375

Descriptive statistics

Standard deviation: 22962.92877
Coefficient of variation (CV): 1.746183234
Kurtosis: 7.905712444
Mean: 13150.35463
Median Absolute Deviation (MAD): 2328.57
Skewness: 2.685038938
Sum: 9205248.24
Variance: 527296097.9
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Value | Count | Frequency (%)
0 | 53 | 7.6%
5690 | 3 | 0.4%
1218.6 | 3 | 0.4%
20139 | 3 | 0.4%
6822.5 | 2 | 0.3%
29538 | 2 | 0.3%
2663 | 2 | 0.3%
13244 | 2 | 0.3%
1037.7 | 2 | 0.3%
4826.25 | 2 | 0.3%
Other values (505) | 626 | 89.4%

Value | Count | Frequency (%)
0 | 53 | 7.6%
18.41 | 1 | 0.1%
25.34 | 1 | 0.1%
44.73 | 1 | 0.1%
48.15 | 1 | 0.1%

Value | Count | Frequency (%)
149677.5 | 1 | 0.1%
125820 | 1 | 0.1%
119756 | 2 | 0.3%
115830 | 1 | 0.1%
112927.5 | 1 | 0.1%

Sales
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 559
Distinct (%): 79.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 169609.0718
Minimum: 1655.08
Maximum: 1159200
Zeros: 0
Zeros (%): 0.0%
Memory size: 5.6 KiB

Quantile statistics

Minimum: 1655.08
5-th percentile: 4535.65
Q1: 15928
Median: 35540.2
Q3: 261077.5
95-th percentile: 683574.75
Maximum: 1159200
Range: 1157544.92
Interquartile range (IQR): 245149.5

Descriptive statistics

Standard deviation: 236726.3469
Coefficient of variation (CV): 1.395717484
Kurtosis: 2.188633088
Mean: 169609.0718
Median Absolute Deviation (MAD): 28716.7
Skewness: 1.696295217
Sum: 118726350.3
Variance: 5.603936332 × 10^10
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Value | Count | Frequency (%)
83600 | 2 | 0.3%
16748.55 | 2 | 0.3%
260580 | 2 | 0.3%
206852.5 | 2 | 0.3%
15056.72 | 2 | 0.3%
30072.48 | 2 | 0.3%
4280.4 | 2 | 0.3%
986811 | 2 | 0.3%
10575.72 | 2 | 0.3%
1685.6 | 2 | 0.3%
Other values (549) | 680 | 97.1%

Value | Count | Frequency (%)
1655.08 | 1 | 0.1%
1685.6 | 2 | 0.3%
1730.54 | 1 | 0.1%
1763.86 | 1 | 0.1%
1822.59 | 1 | 0.1%

Value | Count | Frequency (%)
1159200 | 1 | 0.1%
1038082.5 | 1 | 0.1%
1035625.5 | 1 | 0.1%
1017338 | 2 | 0.3%
986811 | 2 | 0.3%

COGS
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 545
Distinct (%): 77.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 145475.2114
Minimum: 918
Maximum: 950625
Zeros: 0
Zeros (%): 0.0%
Memory size: 5.6 KiB

Quantile statistics

Minimum: 918
5-th percentile: 2449.5
Q1: 7490
Median: 22506.25
Q3: 245607.5
95-th percentile: 608643.75
Maximum: 950625
Range: 949707
Interquartile range (IQR): 238117.5

Descriptive statistics

Standard deviation: 203865.5061
Coefficient of variation (CV): 1.401376249
Kurtosis: 1.608462973
Mean: 145475.2114
Median Absolute Deviation (MAD): 19576.25
Skewness: 1.549047562
Sum: 101832648
Variance: 4.156114458 × 10^10
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Value | Count | Frequency (%)
17430 | 4 | 0.6%
15140 | 3 | 0.4%
79560 | 3 | 0.4%
615000 | 3 | 0.4%
8655 | 3 | 0.4%
24700 | 3 | 0.4%
1101 | 3 | 0.4%
545250 | 2 | 0.3%
158750 | 2 | 0.3%
6630 | 2 | 0.3%
Other values (535) | 672 | 96.0%

Value | Count | Frequency (%)
918 | 1 | 0.1%
1101 | 3 | 0.4%
1158 | 2 | 0.3%
1230 | 2 | 0.3%
1285 | 1 | 0.1%

Value | Count | Frequency (%)
950625 | 1 | 0.1%
948375 | 1 | 0.1%
897000 | 1 | 0.1%
873750 | 1 | 0.1%
771160 | 2 | 0.3%

Profit
Real number (ℝ)

Distinct: 557
Distinct (%): 79.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 24133.86037
Minimum: -40617.5
Maximum: 262200
Zeros: 5
Zeros (%): 0.7%
Memory size: 5.6 KiB

Quantile statistics

Minimum: -40617.5
5-th percentile: -7849.25
Q1: 2805.96
Median: 9242.2
Q3: 22662
95-th percentile: 117138.1
Maximum: 262200
Range: 302817.5
Interquartile range (IQR): 19856.04

Descriptive statistics

Standard deviation: 42760.62656
Coefficient of variation (CV): 1.771810473
Kurtosis: 8.678616216
Mean: 24133.86037
Median Absolute Deviation (MAD): 7383.075
Skewness: 2.712151264
Sum: 16893702.26
Variance: 1828471184
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Value | Count | Frequency (%)
0 | 5 | 0.7%
39072 | 2 | 0.3%
2216.74 | 2 | 0.3%
7584.5 | 2 | 0.3%
6878 | 2 | 0.3%
20420.4 | 2 | 0.3%
4773.25 | 2 | 0.3%
3010.8 | 2 | 0.3%
6802.08 | 2 | 0.3%
11474.4 | 2 | 0.3%
Other values (547) | 677 | 96.7%

Value | Count | Frequency (%)
-40617.5 | 1 | 0.1%
-38046.25 | 1 | 0.1%
-35550 | 1 | 0.1%
-35262.5 | 1 | 0.1%
-33522.5 | 1 | 0.1%

Value | Count | Frequency (%)
262200 | 1 | 0.1%
247500 | 1 | 0.1%
246178 | 2 | 0.3%
238791 | 2 | 0.3%
236716 | 2 | 0.3%

Date
Date

Distinct: 16
Distinct (%): 2.3%
Missing: 0
Missing (%): 0.0%
Memory size: 5.6 KiB
Minimum: 2013-09-01 00:00:00
Maximum: 2014-12-01 00:00:00

Histogram with fixed size bins (bins=16)

Month Number
Real number (ℝ≥0)

Distinct: 12
Distinct (%): 1.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 7.9
Minimum: 1
Maximum: 12
Zeros: 0
Zeros (%): 0.0%
Memory size: 5.6 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1.95
Q1: 5.75
Median: 9
Q3: 10.25
95-th percentile: 12
Maximum: 12
Range: 11
Interquartile range (IQR): 4.5

Descriptive statistics

Standard deviation: 3.37732064
Coefficient of variation (CV): 0.4275089418
Kurtosis: -0.8791598187
Mean: 7.9
Median Absolute Deviation (MAD): 2.5
Skewness: -0.5782921541
Sum: 5530
Variance: 11.40629471
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=12)

Value | Count | Frequency (%)
10 | 140 | 20.0%
12 | 105 | 15.0%
6 | 70 | 10.0%
9 | 70 | 10.0%
11 | 70 | 10.0%
1 | 35 | 5.0%
2 | 35 | 5.0%
3 | 35 | 5.0%
4 | 35 | 5.0%
5 | 35 | 5.0%
Other values (2) | 70 | 10.0%

Value | Count | Frequency (%)
1 | 35 | 5.0%
2 | 35 | 5.0%
3 | 35 | 5.0%
4 | 35 | 5.0%
5 | 35 | 5.0%

Value | Count | Frequency (%)
12 | 105 | 15.0%
11 | 70 | 10.0%
10 | 140 | 20.0%
9 | 70 | 10.0%
8 | 35 | 5.0%

Month Name
Categorical

Distinct: 12
Distinct (%): 1.7%
Missing: 0
Missing (%): 0.0%
Memory size: 5.6 KiB

October: 140
December: 105
November: 70
September: 70
June: 70
Other values (7): 245

Length

Max length: 9
Median length: 7
Mean length: 6.6
Min length: 3

Characters and Unicode

Total characters: 4620
Distinct characters: 26
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: January
2nd row: January
3rd row: June
4th row: June
5th row: June

Value | Count | Frequency (%)
October | 140 | 20.0%
December | 105 | 15.0%
November | 70 | 10.0%
September | 70 | 10.0%
June | 70 | 10.0%
February | 35 | 5.0%
July | 35 | 5.0%
May | 35 | 5.0%
April | 35 | 5.0%
March | 35 | 5.0%
Other values (2) | 70 | 10.0%

Histogram of lengths of the category
ValueCountFrequency (%)
october140
20.0%
december105
15.0%
november70
10.0%
june70
10.0%
september70
10.0%
january35
 
5.0%
july35
 
5.0%
august35
 
5.0%
february35
 
5.0%
april35
 
5.0%
Other values (2)70
10.0%

Most occurring characters

ValueCountFrequency (%)
e910
19.7%
r560
12.1%
b420
 
9.1%
c280
 
6.1%
u245
 
5.3%
m245
 
5.3%
t245
 
5.3%
o210
 
4.5%
a175
 
3.8%
J140
 
3.0%
Other values (16)1190
25.8%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter3920
84.8%
Uppercase Letter700
 
15.2%

Most frequent character per category

ValueCountFrequency (%)
e910
23.2%
r560
14.3%
b420
10.7%
c280
 
7.1%
u245
 
6.2%
m245
 
6.2%
t245
 
6.2%
o210
 
5.4%
a175
 
4.5%
y140
 
3.6%
Other values (8)490
12.5%
ValueCountFrequency (%)
J140
20.0%
O140
20.0%
D105
15.0%
M70
10.0%
A70
10.0%
S70
10.0%
N70
10.0%
F35
 
5.0%

Most occurring scripts

ValueCountFrequency (%)
Latin4620
100.0%

Most frequent character per script

ValueCountFrequency (%)
e910
19.7%
r560
12.1%
b420
 
9.1%
c280
 
6.1%
u245
 
5.3%
m245
 
5.3%
t245
 
5.3%
o210
 
4.5%
a175
 
3.8%
J140
 
3.0%
Other values (16)1190
25.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII4620
100.0%

Most frequent character per block

ValueCountFrequency (%)
e910
19.7%
r560
12.1%
b420
 
9.1%
c280
 
6.1%
u245
 
5.3%
m245
 
5.3%
t245
 
5.3%
o210
 
4.5%
a175
 
3.8%
J140
 
3.0%
Other values (16)1190
25.8%

Year
Categorical

Distinct: 2
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 5.6 KiB

2014: 525
2013: 175

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Characters and Unicode

Total characters: 2800
Distinct characters: 5
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2014
2nd row: 2014
3rd row: 2014
4th row: 2014
5th row: 2014

Value | Count | Frequency (%)
2014 | 525 | 75.0%
2013 | 175 | 25.0%

Histogram of lengths of the category
ValueCountFrequency (%)
2014525
75.0%
2013175
 
25.0%

Most occurring characters

ValueCountFrequency (%)
2700
25.0%
0700
25.0%
1700
25.0%
4525
18.8%
3175
 
6.2%

Most occurring categories

ValueCountFrequency (%)
Decimal Number2800
100.0%

Most frequent character per category

ValueCountFrequency (%)
2700
25.0%
0700
25.0%
1700
25.0%
4525
18.8%
3175
 
6.2%

Most occurring scripts

ValueCountFrequency (%)
Common2800
100.0%

Most frequent character per script

ValueCountFrequency (%)
2700
25.0%
0700
25.0%
1700
25.0%
4525
18.8%
3175
 
6.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII2800
100.0%

Most frequent character per block

ValueCountFrequency (%)
2700
25.0%
0700
25.0%
1700
25.0%
4525
18.8%
3175
 
6.2%

Interactions

[72 interaction plots between pairs of variables; images not preserved.]

Correlations


Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that, for a linear relationship, the steepness of the line does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
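
As a minimal sketch of that calculation, the following pure-Python function divides the sample covariance by the product of the sample standard deviations. The function name `pearson_r` is illustrative, and this is not the code pandas-profiling itself runs. The input values are Units Sold and Gross Sales from the first five sample rows of this report.

```python
import math

def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sample covariance and standard deviations; the (n - 1) factors cancel in the ratio.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x) / (n - 1))
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y) / (n - 1))
    return cov / (std_x * std_y)

# Units Sold and Gross Sales from the first five sample rows.
units_sold = [1618.5, 1321.0, 2178.0, 888.0, 2470.0]
gross_sales = [32370.0, 26420.0, 32670.0, 13320.0, 37050.0]

r = pearson_r(units_sold, gross_sales)
# Shifting and positively rescaling one variable should leave r unchanged.
r_rescaled = pearson_r(units_sold, [0.001 * g + 5.0 for g in gross_sales])
print(r, r_rescaled)
```

Five rows are far too few to say anything about the dataset; the example only illustrates the formula and the location/scale invariance.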

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better than Pearson's r at catching nonlinear monotonic correlations. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
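
The rank-based calculation can be sketched in pure Python as follows: each variable is replaced by its ranks (tied values share the average rank, which is an assumption of this sketch), and Pearson's formula is applied to the ranks. The helper names are illustrative and are not taken from pandas-profiling.

```python
import math

def average_ranks(values):
    """Return 1-based ranks; tied values receive the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks

def pearson_r(x, y):
    """Covariance divided by the product of the standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def spearman_rho(x, y):
    """Pearson's r computed on the rank variables of x and y."""
    return pearson_r(average_ranks(x), average_ranks(y))

# A nonlinear but strictly increasing relationship.
x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]
rho = spearman_rho(x, y)
print(rho, pearson_r(x, y))
```

Because y increases whenever x does, the ranks agree exactly and ρ reaches its maximum, while Pearson's r on the raw values stays below 1.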

Kendall's τ

Like Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
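
The pair-counting definition above corresponds to the variant usually called tau-a. A minimal pure-Python sketch follows; the function name is illustrative. Libraries commonly report the tie-adjusted tau-b variant instead, so results can differ when the data contain ties.

```python
from itertools import combinations

def kendall_tau_a(x, y):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    concordant = discordant = total = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        total += 1
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:
            concordant += 1   # both variables move in the same direction
        elif direction < 0:
            discordant += 1   # the variables move in opposite directions
        # direction == 0 is a tie: it counts toward the total only
    return (concordant - discordant) / total

# Six pairs in total: five concordant and one discordant (the swapped 3 and 2).
tau = kendall_tau_a([1, 2, 3, 4], [1, 3, 2, 4])
print(tau)
```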

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available from the Phik project.

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma in 2013.
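
A minimal pure-Python sketch of a bias-corrected Cramér's V in the form commonly attributed to Bergsma (2013) is shown below. It assumes the usual correction, in which the chi-squared-based φ² and the table dimensions are each adjusted by terms in 1/(n - 1). The function name is illustrative, and pandas-profiling's own implementation may differ in detail.

```python
import math
from collections import Counter

def cramers_v_corrected(x, y):
    """Bias-corrected Cramér's V for two equally long sequences of category labels."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    row_totals = Counter(x)
    col_totals = Counter(y)
    observed = Counter(zip(x, y))
    # Pearson chi-squared statistic of the contingency table.
    chi2 = 0.0
    for a, row_total in row_totals.items():
        for b, col_total in col_totals.items():
            expected = row_total * col_total / n
            chi2 += (observed[(a, b)] - expected) ** 2 / expected
    r, k = len(row_totals), len(col_totals)
    phi2 = chi2 / n
    # Bias corrections for phi^2 and for the numbers of rows and columns.
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denom = min(r_corr - 1, k_corr - 1)
    if denom <= 0:
        return 0.0  # degenerate table, e.g. a variable with a single category
    return math.sqrt(phi2_corr / denom)

# Identical labels are perfectly associated; a balanced 2x2 design is independent.
segment = ["Government"] * 6 + ["Midmarket"] * 4
v_same = cramers_v_corrected(segment, list(segment))
v_indep = cramers_v_corrected(["a", "a", "b", "b"], ["u", "v", "u", "v"])
print(v_same, v_indep)
```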

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows

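Row | Segment | Country | Product | Discount Band | Units Sold | Manufacturing Price | Sale Price | Gross Sales | Discounts | Sales | COGS | Profit | Date | Month Number | Month Name | Year
0 | Government | Canada | Carretera | None | 1618.5 | 3 | 20 | 32370.0 | 0.0 | 32370.0 | 16185.0 | 16185.0 | 2014-01-01 | 1 | January | 2014
1 | Government | Germany | Carretera | None | 1321.0 | 3 | 20 | 26420.0 | 0.0 | 26420.0 | 13210.0 | 13210.0 | 2014-01-01 | 1 | January | 2014
2 | Midmarket | France | Carretera | None | 2178.0 | 3 | 15 | 32670.0 | 0.0 | 32670.0 | 21780.0 | 10890.0 | 2014-06-01 | 6 | June | 2014
3 | Midmarket | Germany | Carretera | None | 888.0 | 3 | 15 | 13320.0 | 0.0 | 13320.0 | 8880.0 | 4440.0 | 2014-06-01 | 6 | June | 2014
4 | Midmarket | Mexico | Carretera | None | 2470.0 | 3 | 15 | 37050.0 | 0.0 | 37050.0 | 24700.0 | 12350.0 | 2014-06-01 | 6 | June | 2014
5 | Government | Germany | Carretera | None | 1513.0 | 3 | 350 | 529550.0 | 0.0 | 529550.0 | 393380.0 | 136170.0 | 2014-12-01 | 12 | December | 2014
6 | Midmarket | Germany | Montana | None | 921.0 | 5 | 15 | 13815.0 | 0.0 | 13815.0 | 9210.0 | 4605.0 | 2014-03-01 | 3 | March | 2014
7 | Channel Partners | Canada | Montana | None | 2518.0 | 5 | 12 | 30216.0 | 0.0 | 30216.0 | 7554.0 | 22662.0 | 2014-06-01 | 6 | June | 2014
8 | Government | France | Montana | None | 1899.0 | 5 | 20 | 37980.0 | 0.0 | 37980.0 | 18990.0 | 18990.0 | 2014-06-01 | 6 | June | 2014
9 | Channel Partners | Germany | Montana | None | 1545.0 | 5 | 12 | 18540.0 | 0.0 | 18540.0 | 4635.0 | 13905.0 | 2014-06-01 | 6 | June | 2014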

Last rows

Row | Segment | Country | Product | Discount Band | Units Sold | Manufacturing Price | Sale Price | Gross Sales | Discounts | Sales | COGS | Profit | Date | Month Number | Month Name | Year
690 | Government | United States of America | VTT | High | 267.0 | 250 | 20 | 5340.0 | 801.00 | 4539.00 | 2670.0 | 1869.00 | 2013-10-01 | 10 | October | 2013
691 | Midmarket | Germany | VTT | High | 1175.0 | 250 | 15 | 17625.0 | 2643.75 | 14981.25 | 11750.0 | 3231.25 | 2014-10-01 | 10 | October | 2014
692 | Enterprise | Canada | VTT | High | 2954.0 | 250 | 125 | 369250.0 | 55387.50 | 313862.50 | 354480.0 | -40617.50 | 2013-11-01 | 11 | November | 2013
693 | Enterprise | Germany | VTT | High | 552.0 | 250 | 125 | 69000.0 | 10350.00 | 58650.00 | 66240.0 | -7590.00 | 2014-11-01 | 11 | November | 2014
694 | Government | France | VTT | High | 293.0 | 250 | 20 | 5860.0 | 879.00 | 4981.00 | 2930.0 | 2051.00 | 2014-12-01 | 12 | December | 2014
695 | Small Business | France | Amarilla | High | 2475.0 | 260 | 300 | 742500.0 | 111375.00 | 631125.00 | 618750.0 | 12375.00 | 2014-03-01 | 3 | March | 2014
696 | Small Business | Mexico | Amarilla | High | 546.0 | 260 | 300 | 163800.0 | 24570.00 | 139230.00 | 136500.0 | 2730.00 | 2014-10-01 | 10 | October | 2014
697 | Government | Mexico | Montana | High | 1368.0 | 5 | 7 | 9576.0 | 1436.40 | 8139.60 | 6840.0 | 1299.60 | 2014-02-01 | 2 | February | 2014
698 | Government | Canada | Paseo | High | 723.0 | 10 | 7 | 5061.0 | 759.15 | 4301.85 | 3615.0 | 686.85 | 2014-04-01 | 4 | April | 2014
699 | Channel Partners | United States of America | VTT | High | 1806.0 | 250 | 12 | 21672.0 | 3250.80 | 18421.20 | 5418.0 | 13003.20 | 2014-05-01 | 5 | May | 2014